Filtering Degenerate Patterns with Application to Protein Sequence Analysis

Authors

  • Matteo Comin
  • Davide Verzotto
Abstract

In biology, the notion of a degenerate pattern plays a central role in describing various phenomena. For example, protein active site patterns, like those contained in the PROSITE database, e.g., [FY]DPC[LIM][ASG]C[ASG], are, in general, represented by degenerate patterns with character classes. Researchers have developed several approaches over the years to discover degenerate patterns. Although these methods have been exhaustively and successfully tested on genomes and proteins, their outcomes often far exceed the size of the original input, making the output hard to manage and to interpret in refined analyses that require manual inspection. In this paper, we discuss a characterization of degenerate patterns with character classes, without gaps, and we introduce the concept of pattern priority for comparing and ranking different patterns. We define the class of underlying patterns for filtering any set of degenerate patterns into a new set that is linear in the size of the input sequence. We present some preliminary results on the detection of subtle signals in protein families. Results show that our approach drastically reduces the number of patterns output by a protein analysis tool, while retaining the representative patterns.
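To make the notion of a degenerate pattern with character classes concrete, the following minimal Python sketch matches the example pattern from the abstract against a sequence. It is only an illustration, not the authors' method: it relies on the fact that this gap-free bracket notation happens to be valid regular-expression syntax, and the function name `find_occurrences` and the toy sequence are assumptions made up for this example.

```python
import re

# Example pattern quoted in the abstract: each bracket is a character class,
# i.e. a set of residues allowed at that position. There are no gaps.
PATTERN = "[FY]DPC[LIM][ASG]C[ASG]"


def find_occurrences(pattern, sequence):
    """Return (start, matched_text) for every occurrence of the pattern.

    The pattern is wrapped in a lookahead so that overlapping
    occurrences are reported as well.
    """
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical toy sequence, invented for illustration only.
toy_sequence = "MKFDPCLACSAAYDPCIGCGKK"

hits = find_occurrences(PATTERN, toy_sequence)
for start, text in hits:
    print(start, text)
```

Both occurrences in the toy sequence differ at the degenerate positions yet match the same pattern, which is what lets a single pattern describe a family of related sites.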

Journal:
  • Algorithms

Volume 6, Issue 

Pages -

Publication date: 2013